
    A User Study for Evaluation of Formal Verification Results and their Explanation at Bosch

    Context: Ensuring safety for any sophisticated system is becoming more complex due to the rising number of features and functionalities. This calls for formal methods to establish confidence in such systems. Nevertheless, using formal methods in industry is demanding because of their lack of usability and the difficulty of understanding verification results. Objective: We evaluate the acceptance of formal methods by Bosch automotive engineers, particularly whether the difficulty of understanding verification results can be reduced. Method: We perform two exploratory studies. First, we conduct a user survey to explore the challenges that Bosch automotive engineers face in identifying inconsistent specifications and in using formal methods. Second, we perform a one-group pretest-posttest experiment to collect impressions from Bosch engineers familiar with formal methods and to evaluate whether our counterexample explanation approach simplifies the understanding of verification results. Results: The results from the user survey indicate that identifying refinement inconsistencies, understanding formal notations, and interpreting verification results are challenging. Nevertheless, engineers are still interested in using formal methods in real-world development processes because they could reduce the manual effort for verification. They also believe formal methods could make the system safer. Furthermore, the results of the one-group pretest-posttest experiment indicate that engineers are more comfortable understanding the counterexample explanation than the raw model checker output. Limitations: The main limitation of this study is the generalizability beyond the target group of Bosch automotive engineers. Comment: This manuscript is under review with the Empirical Software Engineering journal.

    Runtime Verification of Self-Adaptive Systems with Changing Requirements

    To make accurate adaptation decisions, a self-adaptive system needs precise means to analyze itself at runtime. To this end, runtime verification can be used in the feedback loop to check that the managed system satisfies its requirements, formalized as temporal-logic properties. These requirements, however, may change due to system evolution or uncertainty in the environment, the managed system, and the requirements themselves. Thus, the properties under investigation by the runtime verification have to be dynamically adapted to represent the changing requirements while preserving the knowledge about requirements satisfaction gathered so far, all with minimal latency. To address this need, we present a runtime verification approach for self-adaptive systems with changing requirements. Our approach uses property specification patterns to automatically obtain automata with precise semantics that are the basis for runtime verification. The automata can be safely adapted during runtime verification while preserving intermediate verification results, seamlessly reflecting requirement changes and enabling continuous verification. We evaluate our approach on an Arduino prototype of the Body Sensor Network and the Timescales benchmark. Results show that our approach is over five times faster than the typical approach of redeploying and restarting runtime monitors to reflect requirement changes, while improving the system's trustworthiness by avoiding interruptions of verification. Comment: 18th Symposium on Software Engineering for Adaptive and Self-Managing Systems (SEAMS 2023).
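
    To illustrate the adapt-while-monitoring idea, the Python sketch below implements a minimal incremental monitor for a "response" specification pattern ("every request is eventually followed by a response") whose trigger and reaction events can be swapped at runtime without discarding the obligations observed so far. The class name, event names, and adaptation semantics are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch (assumed design, not the paper's implementation): an incremental
# monitor for the "response" pattern whose property can be swapped at runtime
# without restarting the monitor or losing already-observed obligations.

class ResponseMonitor:
    """Tracks open obligations: triggers seen without a matching reaction yet."""

    def __init__(self, trigger, reaction):
        self.trigger = trigger
        self.reaction = reaction
        self.open_obligations = 0  # triggers still waiting for a reaction

    def step(self, event):
        if event == self.trigger:
            self.open_obligations += 1
        elif event == self.reaction and self.open_obligations > 0:
            self.open_obligations -= 1

    def verdict(self):
        # "ok" if no obligation is pending, otherwise the verdict is inconclusive.
        return "ok" if self.open_obligations == 0 else "pending"

    def adapt(self, new_trigger, new_reaction):
        # Swap the property in place; intermediate state (open obligations) is
        # kept, so observed triggers are not forgotten on a requirement change.
        self.trigger, self.reaction = new_trigger, new_reaction


if __name__ == "__main__":
    monitor = ResponseMonitor("request", "response")
    for event in ["request", "tick", "response", "request"]:
        monitor.step(event)
    print(monitor.verdict())   # "pending": one request is still unanswered

    # Requirement change at runtime: reactions are now acknowledgements.
    monitor.adapt("request", "ack")
    monitor.step("ack")
    print(monitor.verdict())   # "ok": the pending obligation is discharged
```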

    A Property Specification Pattern Catalog for Real-Time System Verification with UPPAAL

    Context: The goal of specification pattern catalogs for real-time requirements is to mask the complexity of specifying such requirements in a timed temporal logic for verification. For this purpose, they provide frontends to express pattern-based natural language requirements and translate them to formulae in a suitable logic. However, the widely used real-time model checking tool UPPAAL supports only a restricted subset of such formulae, covering basic, non-nested reachability, safety, and liveness properties. This restriction renders many specification patterns inapplicable. As a workaround, timed observer automata need to be constructed manually to express the sophisticated requirements envisioned by these patterns. Objective: In this work, we fill these gaps by providing a comprehensive specification pattern catalog for UPPAAL. The catalog supports qualitative and real-time requirements and covers all corresponding patterns of existing catalogs. Method: The catalog we propose is integrated with UPPAAL. It supports the specification of qualitative and real-time requirements using patterns and provides an automated generator that translates these requirements to observer automata and TCTL formulae. The resulting artifacts are used for verifying systems in UPPAAL. Thus, our catalog enables an automated end-to-end verification process for UPPAAL based on property specification patterns and observer automata. Results: We evaluate our catalog on three UPPAAL system models reported in the literature and mostly applied in an industrial setting. As a result, we could not only reproduce the related UPPAAL models but also validate an automated, seamless, and accurate pattern- and observer-based verification process. Conclusion: The proposed property specification pattern catalog for UPPAAL enables practitioners to specify qualitative and real-time requirements... Comment: Accepted Manuscript.
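
    The sketch below illustrates the general observer-automaton workaround the abstract refers to: a bounded-response requirement ("req is followed by grant within 50 time units") is turned into a small observer description plus a plain safety query over its error location. The function name, the dictionary encoding, and the concrete pattern instance are assumptions for illustration; the catalog's actual generator emits UPPAAL models and queries.

```python
# Illustrative sketch (assumed names, not the catalog's generator): translating a
# bounded-response pattern instance into (a) a timed observer description and
# (b) the safety query that UPPAAL can check instead of a nested TCTL formula.

def bounded_response_observer(trigger, reaction, bound, clock="x"):
    """Observer for a bounded-response pattern: reaching 'error' means the bound
    was missed, so the requirement reduces to a plain safety query."""
    observer = {
        "locations": ["idle", "waiting", "error"],
        "clock": clock,
        "edges": [
            ("idle", trigger, f"{clock} := 0", "waiting"),
            ("waiting", reaction, "", "idle"),
            # Timeout edge: the bound elapsed before the reaction was observed.
            ("waiting", f"{clock} > {bound}", "", "error"),
        ],
    }
    query = "A[] not Observer.error"
    return observer, query


if __name__ == "__main__":
    observer, query = bounded_response_observer("req", "grant", 50)
    for edge in observer["edges"]:
        print(edge)
    print("Query:", query)
```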

    Tests4Py: A Benchmark for System Testing

    Benchmarks are among the main drivers of progress in software engineering research, especially in software testing and debugging. However, current benchmarks in this field are not well suited to specific research tasks, as they rely on weak system oracles like crash detection, come with only a few unit tests, require more elaborate research setups, or cannot verify the outcome of system tests. Our Tests4Py benchmark addresses these issues. It is derived from the popular BugsInPy benchmark, including 30 bugs from 5 real-world Python applications. Each subject in Tests4Py comes with an oracle to verify the functional correctness of system inputs. In addition, it enables the generation of system tests and unit tests, allowing for qualitative studies that investigate essential aspects of test sets as well as extensive evaluations. These opportunities make Tests4Py a next-generation benchmark for research in test generation, debugging, and automatic program repair. Comment: 5 pages, 4 figures.
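
    The conceptual sketch below contrasts the weak crash-detection oracle mentioned above with a functional, system-level oracle that also checks the produced output. It does not use Tests4Py's actual API; the subject program, command line, and expected property are stand-ins invented for illustration.

```python
# Conceptual sketch (not Tests4Py's API): what a system-level functional oracle adds
# over crash detection when judging a whole-program run on a system input.

import subprocess


def crash_oracle(cmd, system_input):
    """Weak oracle: a test only fails if the program crashes."""
    result = subprocess.run(cmd + [system_input], capture_output=True, text=True)
    return "fail" if result.returncode != 0 else "pass"


def functional_oracle(cmd, system_input, expected_property):
    """Stronger oracle: the output must also satisfy an expected property."""
    result = subprocess.run(cmd + [system_input], capture_output=True, text=True)
    if result.returncode != 0:
        return "fail"
    return "pass" if expected_property(result.stdout) else "fail"


if __name__ == "__main__":
    # Hypothetical subject: a tiny "sort tool" that must emit its comma-separated
    # arguments in sorted order (inlined here so the example is self-contained).
    cmd = ["python3", "-c",
           "import sys; print(','.join(sorted(sys.argv[1].split(','))))"]
    print(crash_oracle(cmd, "3,1,2"))                                      # pass
    print(functional_oracle(cmd, "3,1,2", lambda out: out.strip() == "1,2,3"))  # pass
```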

    QuantUM: Quantitative Safety Analysis of UML Models

    When developing a safety-critical system, it is essential to obtain an assessment of different design alternatives. In particular, an early safety assessment of the architectural design of a system is desirable. In spite of the plethora of available formal quantitative analysis methods, it is still difficult for software and system architects to integrate these techniques into their everyday work. This is mainly due to the lack of methods that can be directly applied to architecture-level models, for instance given as UML diagrams. It is also necessary that the description methods used do not require a profound knowledge of formal methods. Our approach bridges this gap and improves the integration of quantitative safety analysis methods into the development process. All inputs of the analysis are specified at the level of a UML model. This model is then automatically translated into the analysis model, and the results of the analysis are represented back at the level of the UML model. Thus, the analysis model and the formal methods used during the analysis are hidden from the user. We illustrate the usefulness of our approach with an industrial-strength case study. Comment: In Proceedings QAPL 2011, arXiv:1107.074
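
    As a flavor of the kind of quantitative measure such an analysis yields for comparing design alternatives, the sketch below computes the probability of reaching a hazard state within a mission time for a single-channel versus a duplex architecture with exponentially distributed component failures. The failure rate, mission time, and the architecture comparison are assumptions for illustration and are not taken from the paper's case study or its analysis backend.

```python
# Illustrative sketch only: closed-form unreliability of two hypothetical design
# alternatives, to show the kind of result a quantitative safety analysis reports.

import math


def unreliability_single(rate, t):
    """P(a component has failed by time t), for failure rate 'rate' in 1/h."""
    return 1.0 - math.exp(-rate * t)


def unreliability_duplex(rate, t):
    """P(both redundant, independently failing components have failed by time t)."""
    return unreliability_single(rate, t) ** 2


if __name__ == "__main__":
    rate, mission_time = 1e-4, 1000.0  # assumed: 1e-4 failures/h, 1000 h mission
    print(f"single channel: {unreliability_single(rate, mission_time):.4e}")
    print(f"duplex channel: {unreliability_duplex(rate, mission_time):.4e}")
```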

    Semantic Debugging

    Why does my program fail? We present a novel and general technique to automatically determine failure causes and conditions, using logical properties over input elements: "The program fails if and only if int(⟨length⟩) > len(⟨payload⟩) holds - that is, the given ⟨length⟩ is larger than the ⟨payload⟩ length." To obtain such diagnoses, our AVICENNA prototype uses modern techniques for inferring properties of passing and failing inputs and for validating and refining hypotheses by having a constraint solver generate supporting test cases. As a result, AVICENNA produces crisp and expressive diagnoses even for complex failure conditions, considerably improving over the state of the art with diagnoses close to those of human experts.
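
    The Python sketch below mirrors the diagnosis idea on a toy subject: candidate predicates over input features are kept only if they agree with the pass/fail verdict on every observed input, so the surviving predicate is the failure condition. Random input generation stands in for the constraint-solver-backed refinement, and the subject, candidates, and names are invented for illustration; this is not the AVICENNA implementation.

```python
# Simplified sketch of hypothesis refinement for failure diagnosis (illustrative,
# not AVICENNA): keep the predicates that explain every observed pass/fail verdict.

import random


def buggy_parse(length, payload):
    """Toy subject: fails iff the declared length exceeds the payload length."""
    if length > len(payload):
        raise ValueError("length field larger than payload")


CANDIDATES = {
    "length > len(payload)": lambda length, payload: length > len(payload),
    "length == 0": lambda length, payload: length == 0,
    "len(payload) > 5": lambda length, payload: len(payload) > 5,
}


def fails(length, payload):
    try:
        buggy_parse(length, payload)
        return False
    except ValueError:
        return True


def diagnose(samples):
    """Keep candidates consistent with every observed (input, verdict) pair."""
    survivors = dict(CANDIDATES)
    for length, payload in samples:
        observed = fails(length, payload)
        survivors = {name: pred for name, pred in survivors.items()
                     if pred(length, payload) == observed}
    return list(survivors)


if __name__ == "__main__":
    random.seed(0)
    # Randomly generated probes stand in for solver-generated supporting tests.
    samples = [(random.randint(0, 8), "x" * random.randint(0, 8)) for _ in range(50)]
    print("surviving hypotheses:", diagnose(samples))
```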